NSF PAR Search | NSF Public Access Repository


Direct Simultaneous Speech-to-Text Translation Assisted by Synchronized Streaming ASR

https://doi.org/10.18653/v1/2021.findings-acl.406

Chen, Junkun; Ma, Mingbo; Zheng, Renjie; Huang, Liang (January 2021, Proceedings of ACL 2021: Findings)

Simultaneous speech-to-text translation is widely useful in many scenarios. The conventional cascaded approach uses a pipeline of streaming ASR followed by simultaneous MT, but suffers from error propagation and extra latency. To alleviate these issues, recent efforts attempt to directly translate the source speech into target text simultaneously, but this is much harder due to the combination of two separate tasks. We instead propose a new paradigm with the advantages of both cascaded and end-to-end approaches. The key idea is to use two separate, but synchronized, decoders on streaming ASR and direct speech-to-text translation (ST), respectively, where the intermediate results of ASR guide the decoding policy of (but are not fed as input to) ST. During training, we use multitask learning to jointly learn these two tasks with a shared encoder. En-to-De and En-to-Es experiments on the MuST-C dataset demonstrate that our proposed technique achieves substantially better translation quality at similar levels of latency.
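As a rough illustration of how ASR progress could drive an ST decoding policy, the sketch below emits READ/WRITE actions so that the translation stays about `k` words behind the streaming ASR output. The wait-k style rule, the function name `asr_guided_actions`, and the input format are assumptions made for this example; the abstract does not specify the policy at this level of detail.

```python
from typing import List


def asr_guided_actions(asr_word_counts: List[int], k: int, target_len: int) -> List[str]:
    """Return READ/WRITE actions for a wait-k style policy driven by ASR progress.

    asr_word_counts[t] is the number of source words the streaming ASR decoder
    has emitted after speech chunk t (a hypothetical input format).
    The ASR words are used only to decide *when* to write, never as ST input.
    """
    actions: List[str] = []
    written = 0
    for count in asr_word_counts:
        actions.append("READ")  # consume one more speech chunk
        # Write while the translation lags the ASR output by at least k words.
        while written < target_len and count - written >= k:
            actions.append("WRITE")
            written += 1
    # The speech has ended: flush the remaining target words.
    while written < target_len:
        actions.append("WRITE")
        written += 1
    return actions
```

For example, with ASR word counts `[0, 1, 1, 2, 3]`, `k=2`, and three target words, the first WRITE happens only after the ASR decoder has recognized two words.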
Full Text Available
Improving Simultaneous Translation by Incorporating Pseudo-References with Fewer Reorderings

https://doi.org/10.18653/v1/2021.emnlp-main.473

Chen, Junkun; Zheng, Renjie; Kita, Atsuhito; Ma, Mingbo; Huang, Liang (January 2021, Proceedings of EMNLP 2021)

Simultaneous translation is vastly different from full-sentence translation, in the sense that it starts translating before the source sentence ends, with a delay of only a few words. However, due to the lack of large-scale, high-quality simultaneous translation datasets, most such systems are still trained on conventional full-sentence bitexts. This is far from ideal for the simultaneous scenario due to the abundance of unnecessary long-distance reorderings in those bitexts. We propose a novel method that rewrites the target side of existing full-sentence corpora into simultaneous-style translations. Experiments on Zh→En and Ja→En simultaneous translation show substantial improvements (up to +2.7 BLEU) with the addition of these generated pseudo-references.
Full Text Available
Opportunistic Decoding with Timely Correction for Simultaneous Translation

https://doi.org/10.18653/v1/2020.acl-main.42

Zheng, Renjie; Ma, Mingbo; Zheng, Baigong; Liu, Kaibo; Huang, Liang (January 2020, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics)

Simultaneous translation has many important application scenarios and has recently attracted much attention from both academia and industry. Most existing frameworks, however, have difficulty balancing translation quality and latency, i.e., the decoding policy is usually either too aggressive or too conservative. We propose an opportunistic decoding technique with timely correction ability, which always (over-)generates a certain amount of extra words at each step to keep the audience on track with the latest information. At the same time, it also corrects, in a timely fashion, the mistakes in the previously overgenerated words when observing more source context to ensure high translation quality. Experiments show our technique achieves substantial reduction in latency and up to +3.1 increase in BLEU, with revision rate under 8% in Chinese-to-English and English-to-Chinese translation.
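To make the idea of a revision rate concrete, here is a minimal sketch that counts how many already displayed words are withdrawn or changed when a new output replaces the previous one. The definition used here (revised words divided by all emitted words) and the names `count_revisions` and `revision_rate` are assumptions for illustration and may differ from the metric used in the paper.

```python
from typing import List


def count_revisions(prev: List[str], curr: List[str]) -> int:
    """Number of previously displayed tokens that the new output changes or drops."""
    common = 0
    for a, b in zip(prev, curr):
        if a != b:
            break
        common += 1
    # Everything after the longest common prefix had to be revised.
    return len(prev) - common


def revision_rate(outputs: List[List[str]]) -> float:
    """Revised tokens divided by all emitted tokens (an assumed definition)."""
    revised = 0
    emitted = 0
    prev: List[str] = []
    for curr in outputs:
        r = count_revisions(prev, curr)
        revised += r
        # Tokens newly shown at this step: those beyond the unchanged prefix.
        emitted += len(curr) - (len(prev) - r)
        prev = curr
    return revised / emitted if emitted else 0.0
```

Under this definition, the output sequence `["the", "cat"]`, `["the", "dog", "runs"]`, `["the", "dog", "runs", "fast"]` revises one of five emitted words.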
Full Text Available
Learning to Stop in Structured Prediction for Neural Machine Translation

Ma, Mingbo; Zheng, Renjie; Huang, Liang (January 2019, Proceedings of NAACL 2019)

Beam search optimization (Wiseman and Rush, 2016) resolves many issues in neural machine translation. However, this method lacks a principled stopping criterion and does not learn how to stop during training, and in practice the model naturally prefers longer hypotheses at test time since it uses the raw score instead of the probability-based score. We propose a novel ranking method which enables an optimal beam search stopping criterion. We further introduce a structured prediction loss function which penalizes suboptimal finished candidates produced by beam search during training. Experiments on neural machine translation with both synthetic data and real languages (German→English and Chinese→English) demonstrate that our proposed methods lead to better length and BLEU score.
Full Text Available
Multi-Reference Training with Pseudo-References for Neural Translation and Text Generation

Zheng, Renjie; Ma, Mingbo; Huang, Liang (November 2018, Proceedings of EMNLP 2018)

Full Text Available
Ensemble Sequence Level Training for Multimodal MT: OSU-Baidu WMT18 Multimodal Machine Translation System Report

Zheng, Renjie; Yang, Yilin; Ma, Mingbo; Huang, Liang (November 2018, Proceedings of WMT 2018)

Full Text Available
